10 research outputs found

Mel cepstral coefficient modification based on the Glimpse Proportion measure for improving the intelligibility of HMM-generated synthetic speech in noise

Author: King S.
Valentini-Botinhao C.
Yamagishi J.
Publication date: 01/09/2012

Edinburgh Research Explorer

Using an intelligibility measure to create noise robust cepstral coefficients for HMM-based speech synthesis

Author: King S.
Valentini-Botinhao C.
Yamagishi J.
Publication date: 01/05/2012

Edinburgh Research Explorer

Combining perceptually-motivated spectral shaping with loudness and duration modification for intelligibility enhancement of HMM-based synthetic speech in noise

Author: King S.
Stylianou Y.
Valentini-Botinhao C.
Yamagishi J.
Publication date: 01/08/2013

Edinburgh Research Explorer

Cepstral analysis based on the Glimpse proportion measure for improving the intelligibility of HMM-based synthetic speech in noise

Author: King S.
Maia R.
Valentini-Botinhao C.
Yamagishi J.
Zen H.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2012

In this paper we introduce a new cepstral coefficient extraction method based on an intelligibility measure for speech in noise, the Glimpse Proportion measure. This new method aims to increase the intelligibility of speech in noise by modifying the clean speech, and has applications in scenarios such as public announcement and car navigation systems. We first explain how the Glimpse Proportion measure operates and then show how we approximated it in order to integrate it into an existing spectral envelope parameter extraction method commonly used in the HMM-based speech synthesis framework. We then demonstrate how this new method changes the modelled spectrum according to the characteristics of the noise and show results for a listening test with vocoded and HMM-based synthetic speech. The test indicates that the proposed method can significantly improve the intelligibility of synthetic speech in speech-shaped noise. Index Terms: cepstral coefficient extraction, objective measure for speech intelligibility, Lombard speech, HMM-based speech synthesis

CiteSeerX

Crossref

Edinburgh Research Explorer

Deep neural networks employing multi-task learning and stacked bottleneck features for speech synthesis

Author: King S.
Valentini-Botinhao C.
Watts O.
Wu Z.
Publication date: 01/01/2015

Deep neural networks (DNNs) use a cascade of hidden representations to enable the learning of complex mappings from input to output features. They are able to learn the complex mapping from text-based linguistic features to speech acoustic features, and so perform text-to-speech synthesis. Recent results suggest that DNNs can produce more natural synthetic speech than conventional HMM-based statistical parametric systems. In this paper, we show that the hidden representation used within a DNN can be improved through the use of Multi-Task Learning, and that stacking multiple frames of hidden layer activations (stacked bottleneck features) also leads to improvements. Experimental results confirmed the effectiveness of the proposed methods, and in listening tests we find that stacked bottleneck features in particular offer a significant improvement over both a baseline DNN and a benchmark HMM system. Index Terms: Speech synthesis, acoustic model, multi-task learning, deep neural network, bottleneck features

CiteSeerX

Edinburgh Research Explorer

Using linguistic predictability and the Lombard effect to increase the intelligibility of synthetic speech in noise

Author: Valentini-Botinhao C.
Wester M.
Publication date: 01/09/2014

Edinburgh Research Explorer

Investiční životní pojištění versus otevřené podílové fondy (Investment life insurance versus open-end mutual funds)

Author: Godoy E.
King S.
Sauert B.
Stylianou Y.
Valentini-Botinhao C.
Yamagishi J.
Publication date: 01/01/2013

The aim of the thesis is to compare investment life insurance and collective investment funds according to given criteria and to establish which is more advantageous for the client. The literature review focuses on research papers and professional studies of the issue. The practical part is devoted to particular products and their comparison using examples of model clients.

CiteSeerX

Crossref

Edinburgh Research Explorer

Publikationsserver der RWTH Aachen University

National Repository of Grey Literature

Akustische automatische Erkennung von Sigmatismus bei Kindern (Automatic acoustic detection of sigmatism in children)

Author: Bocklet T
Degekolb-Weyers S
Eysholdt U
Nöth E
Valentini Botinhao C
Publication venue: German Medical Science GMS Publishing House; Düsseldorf
Publication date: 18/08/2011

Background: Sigmatism occurs when the tongue is incorrectly positioned during the articulation of sibilants such as /s/ and /z/. We present an automatic, computer-based sigmatism detection system that can decide whether a child's sibilants were pronounced correctly. Material and methods: The data come from 39 adolescents who simulate three different types of sigmatism. In addition, recordings of 10 children are available, 5 of whom actually have sigmatism. The system relies on several features computed directly from the speech recordings: energy analyses of the individual sibilants, Mel-frequency cepstral coefficients, and meta-features based on Gaussian mixture models. Results: Both data sets were analysed at the phone, word and speaker level. On the simulated data, the best system achieved recognition rates of 86% at the phone level, 87% at the word level and 94% at the speaker level. This system was successfully tested on the children with sigmatism. Discussion: The best configuration was built into a biofeedback system that can be used to support therapy.

German Medical Science

Improved Wasserstein conditional generative adversarial network speech enhancement

Author: A Kumar
C Valentini-Botinhao
CC Hsu
F Weninger
I Gulrajani
KFC Yiu
L Sun
S Pascual
Shan Qin
SW Fu
Ting Jiang
Y Bando
Y Xu
Publication venue: 'Springer Science and Business Media LLC'

Crossref